R programming: generics

Published

March 21, 2025

M1 MIDS/MFA/LOGOS

Université Paris Cité

Année 2024

Course Homepage

Moodle

Code
stopifnot(
  require(Hmisc),
  require(skimr),
  require(patchwork),
  require(ggforce),
  require(glue),
  require(ggfortify),
  require(broom),
  require(tidyverse)
)

tidymodels::tidymodels_prefer(quiet = TRUE)

old_theme <-theme_set(
  theme_minimal(base_size=9, 
                base_family = "Helvetica")
)
Code
gc <- options(ggplot2.discrete.colour="viridis")
gc <- options(ggplot2.discrete.fill="viridis")
gc <- options(ggplot2.continuous.fill="viridis")
gc <- options(ggplot2.continuous.colour="viridis")

Objectives

Code
stopifnot(
  require(rlang),
  require(lobstr),
  require(sloop),
  require(devtools),
  require(usethis),
  require(testthat),
  require(generics)
)
Loading required package: rlang
Loading required package: lobstr
Loading required package: sloop
Loading required package: devtools
Loading required package: usethis
Loading required package: testthat
Loading required package: generics

Generics and S3 classes

Let us first create an instance of class lm.

Code
lm0  <- lm(Gas ~ Insul * Temp, MASS::whiteside)
Question
  • What does function class() do?
  • Is it possible to belong to type list and to class lm simultaneously?
  • In R what is an attribute?
  • How do we set and get attributes?
  • What does function inherits() do?
Question

Load package sloop.

  • What does sloop::otype() do? Apply it to an object of class lm.
  • What happens when we first unclass() the object?
Question

sloop exports functions s3_class() and s3_get_method()

  • Apply s3_class() to all members of lm0
  • What is the otype of autoplot() applied to an object of class lm?
Question
Question
Question
Question

Programming with dplyr and ggplot2

We first aim at programming a function that takes as input a dataframe df, a column name col, and that, depending on the type of the column denoted by col, plots a histogram (for numerical column), a barplot (for factors), or raise an error of the column is neither categorical, nor numerical.

The function should return a ggplot object.

Here is a first attempt.

Let us first build a toy tibble.

Code
tb <- tibble( 
  col_num = rnorm(100), 
  col_fac = as_factor(sample(letters, 100, replace = T)), 
  col_ts = Sys.time() + duration(sample(1:20, 100, replace=T),units="days")
) 

tb |> 
  head()
# A tibble: 6 × 3
  col_num col_fac col_ts             
    <dbl> <fct>   <dttm>             
1   0.908 g       2025-04-10 16:51:54
2  -0.105 c       2025-03-26 15:51:54
3   0.598 h       2025-04-03 16:51:54
4   1.65  t       2025-04-04 16:51:54
5  -0.618 b       2025-03-26 15:51:54
6   0.318 i       2025-04-10 16:51:54
Question
  • Pass more optional arguments to geom_... (use ellipsis ...)
  • Avoid quoting the column name
Question
Question

How could you add a geom_point() layer to each element of the following list?

Code
plots <- list(
  ggplot(mpg, aes(displ, hwy)),
  ggplot(diamonds, aes(carat, price)),
  ggplot(faithfuld, aes(waiting, eruptions, size = density))
)

From R Advanced Programming

Inside lm()

Question

In classes like lm, prcomp, … we have a member called call. What does it represent? How is it constructed?

First, read the code of lm.

> lm 
function (formula, data, subset, weights, na.action, method = "qr", 
    model = TRUE, x = FALSE, y = FALSE, qr = TRUE, singular.ok = TRUE, 
    contrasts = NULL, offset, ...) 
{
    ret.x <- x
    ret.y <- y
    cl <- match.call()
    mf <- match.call(expand.dots = FALSE)
    m <- match(c("formula", "data", "subset", "weights", "na.action", 
        "offset"), names(mf), 0L)
    mf <- mf[c(1L, m)]
    mf$drop.unused.levels <- TRUE
    mf[[1L]] <- quote(stats::model.frame)
    mf <- eval(mf, parent.frame())
    if (method == "model.frame") 
        return(mf)
    else if (method != "qr") 
        warning(gettextf("method = '%s' is not supported. Using 'qr'", 
            method), domain = NA)
    mt <- attr(mf, "terms")
    y <- model.response(mf, "numeric")
    w <- as.vector(model.weights(mf))
    if (!is.null(w) && !is.numeric(w)) 
        stop("'weights' must be a numeric vector")
    offset <- model.offset(mf)
    mlm <- is.matrix(y)
    ny <- if (mlm) 
        nrow(y)
    else length(y)
    if (!is.null(offset)) {
        if (!mlm) 
            offset <- as.vector(offset)
        if (NROW(offset) != ny) 
            stop(gettextf("number of offsets is %d, should equal %d (number of observations)", 
                NROW(offset), ny), domain = NA)
    }
    if (is.empty.model(mt)) {
        x <- NULL
        z <- list(coefficients = if (mlm) matrix(NA_real_, 0, 
            ncol(y)) else numeric(), 
                  residuals = y, 
                  fitted.values = 0 * y, 
                  weights = w, 
                  rank = 0L, 
                  df.residual = if (!is.null(w)) sum(w != 0) else ny
              )
        if (!is.null(offset)) {
            z$fitted.values <- offset
            z$residuals <- y - offset
        }
    }
    else {
        x <- model.matrix(mt, mf, contrasts)
        z <- if (is.null(w)) 
            lm.fit(x, y, offset = offset, singular.ok = singular.ok, 
                ...)
        else lm.wfit(x, y, w, offset = offset, singular.ok = singular.ok, 
            ...)
    }
    class(z) <- c(if (mlm) "mlm", "lm")
    z$na.action <- attr(mf, "na.action")
    z$offset <- offset
    z$contrasts <- attr(x, "contrasts")
    z$xlevels <- .getXlevels(mt, mf)
    z$call <- cl
    z$terms <- mt
    if (model) 
        z$model <- mf
    if (ret.x) 
        z$x <- x
    if (ret.y) 
        z$y <- y
    if (!qr) 
        z$qr <- NULL
    z
}
<bytecode: 0x55564224e930>
<environment: namespace:stats>
Question

Have a look at function match.call()

Data masking and environments

Question

Tidy evaluation

Question

What is quasi-quotation?

Keep the rlang cheatsheet around.

Question

Explain the difference between an expression and a quosure

Question

References

Programming with ggplot